Prediction Model for Lung Cancer in High-Risk Nodules Being Considered for Resection: Development and Validation in a Chinese Population

Background: Distinguishing benign from malignant nodules before surgery is difficult in patients with pulmonary nodules, which in turn makes it difficult to choose an appropriate treatment. This study aimed to develop a lung cancer risk prediction model that predicts the nature of a pulmonary nodule and supports the decision on whether to perform a surgical intervention.
Methods: This retrospective study included patients with pulmonary nodules who underwent lobectomy or sublobectomy at Tianjin Medical University General Hospital between 2017 and 2020. All subjects were divided into training and validation sets. Multivariable logistic regression models with backward selection based on the Akaike information criterion were used to identify independent predictors and develop prediction models.
Results: A total of 503 malignant and 260 benign nodules were used to build and validate the model. Covariates predicting lung cancer in the current model included female sex, age, smoking history, nodule type (pure ground-glass and part-solid), nodule diameter, lobulation, margin (smooth or spiculated), calcification, intranodular vascularity, pleural indentation, and carcinoembryonic antigen. The final model showed excellent discrimination and calibration, with a concordance index (C-index) of 0.914 (0.890–0.939). In an independent validation sample, the C-index of the current model was 0.876 (0.825–0.927), compared with 0.644 (0.559–0.728) and 0.681 (0.605–0.757) for the Mayo and Brock models, respectively. The decision curve analysis showed that the current model had a higher net benefit than the Mayo and Brock models.
Conclusions: The current model can be used to estimate the probability of lung cancer in nodules requiring surgical intervention. It may reduce unnecessary procedures for benign nodules and prompt the diagnosis and treatment of malignant nodules.


INTRODUCTION
Lung cancer is the most common malignancy in the world and the leading cause of cancer mortality (1). The prognosis is very poor: the 5-year survival rate for lung cancer is only 19.7% (2,3) despite recent improvements (4). According to the eighth edition of the TNM staging of lung cancer published by the International Association for the Study of Lung Cancer, 80% of patients with stage IA non-small-cell lung cancer (NSCLC) are alive ≥5 years after diagnosis. However, this proportion drops to <10% in patients with stage IV disease (5). The poor survival of patients with lung cancer may be primarily because the majority of patients are diagnosed at an advanced stage (6).
Based on the findings of the National Lung Screening Trial (7), computed tomography (CT) or low-dose CT (LDCT) has been recommended as an effective tool for lung cancer screening in many countries and regions (8)(9)(10)(11). Although CT or LDCT helps detect lung cancer at an early stage, the majority of pulmonary nodules (PNs) detected by CT are benign (7). Distinguishing malignant from benign PNs has become a challenge for clinicians, and follow-up examinations (e.g., follow-up scans and invasive biopsies) may incur additional costs or harm the patient (12). In recent decades, several lung cancer risk prediction models based on radiological characteristics and clinical information have been developed to assist clinicians in managing patients with pulmonary nodules (13)(14)(15)(16)(17)(18). These models have shown good discrimination in independent cohorts. Moreover, some of them have been recommended by guidelines for the classification of high- and low-risk pulmonary nodules (11).
However, most of these models were built on initial plain CT or LDCT scans and were intended for use at baseline. In addition, the diagnostic performance of a model may deteriorate when it is applied to dissimilar populations. Clinicians rarely recommend an invasive procedure in patients with PNs after their initial scan; consequently, PNs are often observed for a period before a decision is made. A tool accurate enough to assist clinicians at this point, before they decide on an invasive procedure, would be clinically useful in avoiding overdiagnosis and facilitating early diagnosis. Unlike previously reported models, this study focused on subjects with PNs that clinicians highly suspected to be lung cancer (all of these patients underwent surgery).

Study Population
The training database included a retrospective sample of patients with at least one pulmonary nodule measuring 5 to 30 mm in diameter on the CT lung window and a definitive histopathologic diagnosis obtained by surgery at Tianjin Medical University General Hospital between 2017 and 2019. Individuals with atelectasis, obstructive pneumonia, or pleural effusion on CT; ongoing antitumor therapy; preoperative non-surgical histopathologic diagnosis; history of lung cancer diagnosis; history of pulmonary surgery; pulmonary metastatic disease; and age < 18 years were excluded. Patient and clinicopathologic characteristics were collected through chart review and electronic medical records. A malignant or benign diagnosis was established by pathologic examination of tissue obtained by complete resection of the nodule or of the lobe in which it resided (including lobectomy and sublobectomy). The validation dataset included individuals meeting the same criteria who were diagnosed between 2019 and 2020 and was collected independently of the training cohort. A total of 785 patients met the inclusion criteria. Twenty-two patients were excluded because CT data were lacking. Eventually, 763 patients were enrolled in this study. The model training set included the contrast-enhanced preoperative CT images of the patients.
Conventional radiologic staging before surgery generally includes contrast-enhanced CT of the chest and abdomen, emission computed tomography of the bone, and magnetic resonance imaging of the brain. Clinical data collection, shown in Tables 1-3 according to lung cancer status, included clinical characteristics, radiographic PN characteristics, and serum tumor markers [carcinoembryonic antigen (CEA), cytokeratin 19 fragment 21-1 (CYFRA 21-1), squamous cell carcinoma antigen (SCC), and neuron-specific enolase (NSE)]. Clinical characteristics included sex, age at diagnosis, smoking history, cancer history other than lung cancer, and family history of lung cancer. Two experienced thoracic radiologists identified and characterized PNs according to lobar location, size (long-axis diameter), presence of morphologic features (e.g., spiculation, calcification, and lobulation), and type (ground-glass, part-solid, or solid nodules). Nodules would be characterized as multiple if more than one similar nodule exist and are considered to be

Statistical Analysis
Descriptive statistics were used to describe the characteristics of the patient cohorts. Continuous data were expressed as means ± standard deviation or medians with interquartile ranges and were compared between groups using the Student's t-test or the Mann-Whitney U test, as appropriate. Categorical data were given as counts and percentages and were analyzed using Pearson χ² tests. Binomial logistic regression models were used, and Akaike information criterion values were applied to determine which combination of model predictors best explained the data. Model performance was assessed using estimates of discrimination (ability to classify benign and malignant PNs) and calibration (how well probabilities predicted by the model agree with actual observed risk). Discrimination was measured with the Harrell C-index, corrected using 1,000 bootstrap resamples (19). Calibration was assessed by plotting the difference between actual (Kaplan-Meier method) and predicted survival probabilities of malignancy (20,21). The area under the receiver operating characteristic curve (AUC) and decision curve analysis (DCA) (22) were used to assess the diagnostic performance of all models. All tests were two-tailed, with p < 0.05 considered significant. All statistical analyses were performed with R version 4.0.3 (The R Foundation for Statistical Computing) and SPSS version 23 for Windows.
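As an illustration of the discrimination measure described above, the following minimal Python sketch computes the C-index for a binary outcome, where it equals the AUC. It is not the authors' analysis code (the study used R and SPSS); the function name `c_index` and the toy data are hypothetical and chosen only for illustration.

```python
from itertools import product


def c_index(y_true, y_score):
    """Concordance index for a binary outcome (equals the ROC AUC).

    Counts, over all (malignant, benign) pairs, how often the malignant
    nodule received the higher predicted probability; ties count as 0.5.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    concordant = 0.0
    for p, n in product(pos, neg):
        if p > n:
            concordant += 1.0
        elif p == n:
            concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(round(c_index(labels, scores), 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of malignant from benign nodules.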

Clinical and Nodule Characteristics
The patients in the malignant group were older (57.6 ± 10.5 vs. 63.0 ± 8.6, p < 0.001), and malignant nodules were more frequent in females than in males (55.4% vs. 44.6%; p = 0.075). Of the patients, 213 (37.8%) were current or former smokers, and 36 (5.7%) had a history of extrathoracic cancer. Moreover, 112 (19.9%) patients had a history of chronic obstructive pulmonary disease (COPD) or radiographic evidence of emphysema. The clinical characteristics of patients are shown in Table 1.
Of the patients, 197 (35.0%) had at least one tumor marker elevated at diagnosis, and 138 of them were malignant. Median CEA and CYFRA 21-1 in malignant nodules were significantly (p < 0.05) higher than those in benign nodules. The serum tumor markers of patients are summarized in Table 3.

Predictive Model
In the final multivariate logistic regression model (M1), the diagnosis of cancer in a nodule was associated with sex, age at diagnosis, smoking history, lymphadenopathy, vacuole or air bronchogram, nodule type (pure ground-glass and part-solid), nodule diameter, lobulation, margin (smooth, spiculated, or none of these), calcification, intranodular vascularity, pleural indentation, and CEA (Table 4). M1 showed high discriminative ability, with a C-index of 0.914 (0.890-0.939) and a C-index of 0.906 (0.885-0.927) on internal validation with 1,000 bootstrap resamples and adjustment for optimism. The calibration curve for the model is plotted in Figure 1.
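The adjustment for optimism mentioned above can be written out as follows. This is a sketch of the usual bootstrap optimism correction, on the assumption that the standard procedure referenced in the Statistical Analysis section was followed; the symbols C_app (index of the final model on the training data), C_boot^(b) (index of the model refitted on bootstrap sample b, evaluated on that sample), C_orig^(b) (the same refitted model evaluated on the original data), and B (number of resamples) are notation introduced here, not taken from the source.

```latex
C_{\mathrm{corrected}}
  = C_{\mathrm{app}}
  - \frac{1}{B}\sum_{b=1}^{B}\left( C^{(b)}_{\mathrm{boot}} - C^{(b)}_{\mathrm{orig}} \right),
  \qquad B = 1000
```

Under this reading, the two values reported for M1 imply an average optimism of 0.914 - 0.906 = 0.008.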

Model Comparison in the Validation Cohort
In the external validation cohort (Figure 2), the diagnostic performance of M1, M1b (M1 without serum tumor markers), the Mayo model, and the Brock model was compared using the AUC (95% CI). For M1, M1b, the Mayo model, and the Brock model, the AUC was 0.876 (0.825-0.927), 0.877 (0.827-0.927), 0.644 (0.559-0.728), and 0.681 (0.605-0.757), respectively. The discrimination performance of the current model was significantly better than that of the Mayo (p < 0.01) or Brock (p < 0.01) model. Notably, the multivariate logistic regression analyses showed that CEA was an independent predictor of malignant nodules, but M1 was not superior to M1b in external validation.
A decision curve (22) was plotted to compare the benefit of these three models and to put the results in a clinical context (Figure 3). The net benefit of M1 was higher than that of either the Mayo or the Brock model at all threshold probabilities >10%. Thus, patients whose cancer risk was approximately one in 10 or higher and who receive surgery would benefit from the current model. The density distribution of the predicted probability scores of the three models in the validation cohort is shown in Figure 4. The M1 score was >75% for 79% of individuals with malignant PNs, whereas the scores of subjects with benign PNs tended to be more dispersed. In contrast, the scores from the Mayo and Brock models showed no clear concentration.
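The net benefit plotted in a decision curve can be sketched in a few lines of Python. The formula below is the standard one used in decision curve analysis (true positives minus false positives weighted by the odds of the threshold probability, per patient); the function names and the toy data are hypothetical and do not come from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: resect every nodule regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (hypothetical): 1 = malignant, 0 = benign.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
probs = [0.95, 0.85, 0.80, 0.60, 0.55, 0.30, 0.20, 0.10, 0.15, 0.05]

for pt in (0.10, 0.25, 0.50):
    print(pt,
          round(net_benefit(labels, probs, pt), 3),
          round(net_benefit_treat_all(labels, pt), 3))
```

Evaluating both functions over a grid of thresholds and plotting the results gives the model curve and the "treat all" reference line; the "treat none" strategy has a net benefit of zero at every threshold.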

DISCUSSION
Early detection and accurate diagnosis are effective ways to lower lung cancer mortality. Given the occult onset of the disease, CT screening may currently be the preferred test for early diagnosis and management of clinically significant lung nodules. However, the optimal target PNs and the timing of biopsy remain uncertain (23). The American College of Chest Physicians (CHEST) guidelines for lung cancer screening (version 2021) summarized the results of 17 clinical trials and revealed that 22.0% of surgeries were performed for benign disease (range, 8% to 39%) (24). How to reduce benign resections without delaying the diagnosis of lung cancer has become a research hotspot. This evidence-based, retrospective project established a malignancy risk prediction model to reassess the PNs that clinicians considered to require biopsy. This study reviewed data from 763 subjects with lung nodules that were clinically considered highly likely to be malignant and who underwent surgical resection between 2017 and 2020. Except for a few confirmed benign diseases, most nodules were considered to be malignant preoperatively. Although the observation and intervention recommended in the guidelines (11,25,26) had been carried out before surgery, nearly one in three nodules proved to be benign. The initial M1 model, built with all predictors, showed excellent predictive accuracy (with an AUC of 0.876 in an external validation cohort) and calibration (Figure 1). M2 was built because the distributions of benign and malignant lesions differed across the three nodule densities. However, M2 did not perform better than M1 in classifying solid nodules in the validation cohort (AUC, 0.904 vs. 0.896). Serum tumor markers did not prove to be as strong a predictor as anticipated in the multivariate analyses. Thus, the M1b model was built without tumor markers. In the validation data, in which tumor marker levels tended to be lower even in malignant cases, M1 did not perform better than M1b.
Even if CEA levels show differences between benign and malignant nodules, the effectiveness of tumor markers in the classification of PNs needs further verification.
Smoking is a risk factor for lung cancer (13,14,27). The smoking rate in the malignant cohort in this study was 37%, which was much lower than that in other studies, especially screen-based studies (15,28,29). Moreover, smoking history was an independent predictor of lung cancer in the final multivariable model, although no difference was demonstrated between the groups. The smoking prevalence in the current study may be lower largely because of the differing smoking habits of the male and female populations (30). Females had a lower smoking prevalence than males in this study (10.2% vs. 68.2%, p < 0.001). Moreover, female sex was significantly associated with malignant PNs, which agrees with previous studies (15,16,31). Emphysema or COPD has been reported to increase the risk of lung cancer (32), but this was not observed in this study. Intranodular vascularity was found to correlate strongly with lung cancer risk, which is consistent with the theory of tumor angiogenesis (33). The proportion of malignancy was higher in subsolid nodules than in solid nodules because most subsolid nodules resected in this study had been monitored until their features changed on follow-up CT. However, this process may exclude some benign lesions. Changes in the CT appearance of subsolid PNs suggest malignancy (34,35). Although the largest diameter does not necessarily mean the highest probability of malignancy (15), malignancies were more often found in larger nodules in our study (17 mm vs. 14 mm, p = 0.001), similar to previous studies (13,14,18). Other risk factors used in the early differential diagnosis of lung cancer (e.g., nodules with spiculation, lobulation, calcification, or pleural indentation) were also significantly associated with lung cancer in this study (36)(37)(38).
Unlike previous models (13)(14)(15)(16)(17)(18), the current model was developed using the preoperative contrast-enhanced CT scan and serum tumor markers. In the external validation set, the AUC for the current model was 0.876 compared with 0.644 and 0.681 for the Mayo and Brock models, respectively (Figure 2). These models were also compared using the decision curve (Figure 3), which showed that the current model had a higher net benefit than the Mayo or the Brock model. The density distribution of the predicted probability scores of these models in the validation set was plotted to determine whether these differences would be helpful in the clinical management of patients with PNs whose risk is high enough to warrant an invasive procedure (Figure 4). The current model classified 79% and 2% of malignant nodules at probability thresholds of ≥0.75 and ≤0.25, respectively. In comparison, the Mayo and Brock models had skewed score distributions for all PNs. Although the current model gave discrimination values that outperform those of the Mayo and Brock models, the models cannot be directly compared because accuracy can vary considerably between populations (39). The malignancy proportions in the Mayo (23.2%) and Brock (5.5%) cohorts are much lower than that among patients whose PNs were suspected to be malignant after the observation recommended by guidelines. Models derived from populations with a low prevalence of malignancy may underestimate the risk when used in high-prevalence populations. Therefore, we suggest that medical centers develop models according to their local populations to help with the clinical management of PNs, instead of directly applying screening models. The current model is more suitable for the reassessment of patients admitted for planned surgery or biopsy.
The proportion of malignant and benign nodules in the density distribution of the predicted probability of the current model may be helpful in clinical decision-making given the pros and cons of observation, biopsy, or surgery (Figure 4).
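The point made above, that a model derived in a low-prevalence cohort may underestimate risk in a high-prevalence one, can be illustrated with a simple intercept-only (prior shift) adjustment on the log-odds scale. This is a textbook illustration and not a method used in this study; it assumes the predictor effects are unchanged between populations, and the function names are hypothetical. The two prevalences are taken from the text (23.2% for the Mayo cohort; 503 malignant of 763 nodules here), while the example risk of 0.30 is arbitrary.

```python
import math


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: map log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def shift_prevalence(p, source_prev, target_prev):
    """Re-express a predicted risk for a population with another prevalence.

    Adds the difference in log prior odds to the predicted log odds
    (an intercept-only update); predictor effects are assumed unchanged.
    """
    return inv_logit(logit(p) + logit(target_prev) - logit(source_prev))


mayo_prev = 0.232          # malignancy proportion quoted for the Mayo cohort
surgical_prev = 503 / 763  # malignant fraction among all nodules in this study

# Arbitrary example: a nodule scored at 30% by a low-prevalence model.
adjusted = shift_prevalence(0.30, mayo_prev, surgical_prev)
print(round(adjusted, 2))
```

The adjusted risk is substantially higher than the original 30%, which matches the direction of the underestimation described in the text; a full recalibration would also need to check whether the predictor effects themselves transfer.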
This study has several limitations. First, the history of previous imaging follow-up of the patient cohort was incomplete because ours is a tertiary referral center. Therefore, this study was unable to evaluate the effect of temporal nodule evolution. Moreover, there were no uniform criteria for suspicion of malignancy, which was determined by the subjective judgment of thoracic surgeons. Furthermore, the data were split by time point into training and validation cohorts to limit the effect of overfitting; nevertheless, the current model may not perform as well in other study populations. Second, this study failed to build a model exclusively for subsolid nodules. The proportion of benign lesions was only 1 in 10 for subsolid nodules in this study and was too low to perform a multivariate logistic regression. The most likely explanation is that the subsolid nodules included in this study were all observed until their features changed on follow-up CT. Such changes are suspected to be useful in discriminating benign from malignant nodules. Unfortunately, we were unable to summarize the observation period. Lastly, this study was not able to examine nodule classification models that incorporated other factors associated with lung cancer risk [i.e., positron emission tomography-CT (40) and nodule volume (16,41)] due to the lack of such data.

CONCLUSIONS
This study developed and externally validated a risk model for estimating the probability of lung cancer in PNs for which invasive intervention was recommended. The model could be considered before more invasive treatments to justify their necessity. Established by using readily available clinical

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the institutional ethics committee board of Tianjin Medical University General Hospital (No. IRB2019-KY-153), and individual consent for this retrospective analysis was waived.